{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Project: Benchmarking Sorting Algorithms" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Pat McDonald G00281051" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Introduction\n", "\n", "\n", "As humans, we developed the capacity to sort objects and data in order to arrange such these, and to be able to search for them more efficiently. In computer science, it is algorithms that enable computers to perform sorting tasks exponentially faster than our human brains ever can. But, not all sorting algorithms are equal. And in this project, I shall attempt to demonstrate this.\n", "\n", "\n", "Time and Space complexity\n", "\n", "In analysis of an algorithm. Its time complexity is the amount of time required by the algorithm to run. In measuring this, the worst case is generally used. That is, the maximum execution time with a given set of inputs to the algorithm.\n", "https://en.wikipedia.org/wiki/Time_complexity\n", "\n", "Space complexity\n", "\n", "This is the amount of computer memory required by the algorithm to run to completion;\n", "https://en.wikipedia.org/wiki/Space_complexity\n", "\n", "\n", "Performance\n", "\n", "Performance is measured by the amount of time and system resources used by an algorithm in its implementation.\n", "\n", "Complexity is classified with regrards to the order of growth in complexity:\n", "\n", "Image source: http://bigocheatsheet.com/\n", "\n", "<img src = BigO.png>\n", "\n", "\n", "Image source: https://medium.com/@george.seif94/a-tour-of-the-top-5-sorting-algorithms-with-python-code-43ea9aa02889\n", "\n", "<img src = time_space.png>\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " ### 1. Bubble Sort" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Bubble Sort diagram from https://interactivepython.org/runestone/static/pythonds/SortSearch/TheBubbleSort.html\n", "<img src=\"bubblepass.png\">" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "https://en.wikipedia.org/wiki/Bubble_sort\n", "\n", "The Bubble sort is considered the most basic sorting algorithm. It's a simple comparison based algorithm in which a list is iterated through, and adjacent pairs of data elements swapped until the list is sorted in order.\n", "This process is considered too inefficient to be used extensively in software development." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[17, 20, 26, 31, 44, 54, 55, 77, 93]\n" ] } ], "source": [ "#Code source: https://interactivepython.org/runestone/static/pythonds/SortSearch/TheBubbleSort.html\n", "\n", "def bubbleSort(alist):\n", " for passnum in range(len(alist)-1,0,-1):\n", " for i in range(passnum):\n", " if alist[i]>alist[i+1]:\n", " temp = alist[i]\n", " alist[i] = alist[i+1]\n", " alist[i+1] = temp\n", "\n", "alist = [54,26,93,17,77,31,44,55,20]\n", "bubbleSort(alist)\n", "print(alist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2. Merge Sort" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Images source: https://interactivepython.org/runestone/static/pythonds/SortSearch/TheMergeSort.html\n", "\n", "<img src = mergesortA.png>\n", "\n", "<img src = mergesortB.png>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "https://www.studytonight.com/data-structures/merge-sort\n", "\n", "Merge Sort works in a \"divide and conquer\" method. The data inputed (as an array) is iteratively divided into smaller sub-arrays. These sub-arrays are sorted, and then merged to form the sorted array." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Splitting [54, 26, 93, 17, 77, 31, 44, 55, 20]\n", "Splitting [54, 26, 93, 17]\n", "Splitting [54, 26]\n", "Splitting [54]\n", "Merging [54]\n", "Splitting [26]\n", "Merging [26]\n", "Merging [26, 54]\n", "Splitting [93, 17]\n", "Splitting [93]\n", "Merging [93]\n", "Splitting [17]\n", "Merging [17]\n", "Merging [17, 93]\n", "Merging [17, 26, 54, 93]\n", "Splitting [77, 31, 44, 55, 20]\n", "Splitting [77, 31]\n", "Splitting [77]\n", "Merging [77]\n", "Splitting [31]\n", "Merging [31]\n", "Merging [31, 77]\n", "Splitting [44, 55, 20]\n", "Splitting [44]\n", "Merging [44]\n", "Splitting [55, 20]\n", "Splitting [55]\n", "Merging [55]\n", "Splitting [20]\n", "Merging [20]\n", "Merging [20, 55]\n", "Merging [20, 44, 55]\n", "Merging [20, 31, 44, 55, 77]\n", "Merging [17, 20, 26, 31, 44, 54, 55, 77, 93]\n", "[17, 20, 26, 31, 44, 54, 55, 77, 93]\n" ] } ], "source": [ "#Code source: https://interactivepython.org/runestone/static/pythonds/SortSearch/TheMergeSort.html\n", "\n", "def mergeSort(alist):\n", " print(\"Splitting \",alist)\n", " if len(alist)>1:\n", " mid = len(alist)//2\n", " lefthalf = alist[:mid]\n", " righthalf = alist[mid:]\n", "\n", " mergeSort(lefthalf)\n", " mergeSort(righthalf)\n", "\n", " i=0\n", " j=0\n", " k=0\n", " while i < len(lefthalf) and j < len(righthalf):\n", " if lefthalf[i] < righthalf[j]:\n", " alist[k]=lefthalf[i]\n", " i=i+1\n", " else:\n", " alist[k]=righthalf[j]\n", " j=j+1\n", " k=k+1\n", "\n", " while i < len(lefthalf):\n", " alist[k]=lefthalf[i]\n", " i=i+1\n", " k=k+1\n", "\n", " while j < len(righthalf):\n", " alist[k]=righthalf[j]\n", " j=j+1\n", " k=k+1\n", " print(\"Merging \",alist)\n", "\n", "alist = [54,26,93,17,77,31,44,55,20]\n", "mergeSort(alist)\n", "print(alist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3. Counting Sort" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Image source: https://www.w3resource.com/python-exercises/data-structures-and-algorithms/python-search-and-sorting-exercise-10.php\n", "\n", "<img src = countingsort.png>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "https://www.geeksforgeeks.org/counting-sort/\n", "https://en.wikipedia.org/wiki/Counting_sort\n", "\n", "Counting Sort is a distribution sort algorithm (stable)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 7]\n" ] } ], "source": [ "# Code source: \n", "# https://www.w3resource.com/python-exercises/data-structures-and-algorithms/python-search-and-sorting-exercise-10.php\n", "\n", "def counting_sort(array1, max_val):\n", " m = max_val + 1\n", " count = [0] * m \n", " \n", " for a in array1:\n", " # count occurences\n", " count[a] += 1 \n", " i = 0\n", " for a in range(m): \n", " for c in range(count[a]): \n", " array1[i] = a\n", " i += 1\n", " return array1\n", "\n", "print(counting_sort( [1, 2, 7, 3, 2, 1, 4, 2, 3, 2, 1], 7 ))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4. Quick Sort\n", "\n", "Image source: https://www.hackerrank.com/challenges/quicksort2/problem\n", "\n", "<img src = quicksort.png>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "https://en.wikipedia.org/wiki/Quicksort\n", "\n", "Quick Sort is comparison sort algorithm that utilises a pivot element to initially partition data elements before sorting." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sorted array is:\n", "1\n", "2\n", "3\n", "5\n", "7\n", "8\n", "9\n" ] } ], "source": [ "# Code source: https://www.geeksforgeeks.org/quick-sort/\n", " \n", "# This function takes last element as pivot, places \n", "# the pivot element at its correct position in sorted \n", "# array, and places all smaller (smaller than pivot) \n", "# to left of pivot and all greater elements to right \n", "# of pivot \n", "def partition(arr,low,high): \n", " i = ( low-1 ) # index of smaller element \n", " pivot = arr[high] # pivot \n", " \n", " for j in range(low , high): \n", " \n", " # If current element is smaller than or \n", " # equal to pivot \n", " if arr[j] <= pivot: \n", " \n", " # increment index of smaller element \n", " i = i+1 \n", " arr[i],arr[j] = arr[j],arr[i] \n", " \n", " arr[i+1],arr[high] = arr[high],arr[i+1] \n", " return ( i+1 ) \n", " \n", "# The main function that implements QuickSort \n", "# arr[] --> Array to be sorted, \n", "# low --> Starting index, \n", "# high --> Ending index \n", " \n", "# Function to do Quick sort \n", "def quickSort(arr,low,high): \n", " if low < high: \n", " \n", " # pi is partitioning index, arr[p] is now \n", " # at right place \n", " pi = partition(arr,low,high) \n", " \n", " # Separately sort elements before \n", " # partition and after partition \n", " quickSort(arr, low, pi-1) \n", " quickSort(arr, pi+1, high) \n", " \n", "# Driver code to test above \n", "arr = [5, 8, 1, 3, 7, 9, 2] \n", "n = len(arr) \n", "quickSort(arr,0,n-1) \n", "print (\"Sorted array is:\") \n", "for i in range(n): \n", " print (\"%d\" %arr[i]), \n", " \n", "# This code is contributed by Mohit Kumra" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 5. Selection Sort\n", "\n", "Image source: https://interactivepython.org/runestone/static/pythonds/SortSearch/TheSelectionSort.html\n", "\n", "<img src = selectionsort.png>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Quote reference:https://www.geeksforgeeks.org/selection-sort/\n", "\n", "The selection sort algorithm sorts an array by repeatedly finding the minimum element (considering ascending order) from unsorted part and putting it at the beginning. The algorithm maintains two subarrays in a given array.\n", "\n", "1) The subarray which is already sorted.\n", "2) Remaining subarray which is unsorted.\n", "\n", "In every iteration of selection sort, the minimum element (considering ascending order) from the unsorted subarray is picked and moved to the sorted subarray." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[17, 20, 26, 31, 44, 54, 55, 77, 93]\n" ] } ], "source": [ "#Code source https://interactivepython.org/runestone/static/pythonds/SortSearch/TheSelectionSort.html\n", "\n", "def selectionSort(alist):\n", " for fillslot in range(len(alist)-1,0,-1):\n", " positionOfMax=0\n", " for location in range(1,fillslot+1):\n", " if alist[location]>alist[positionOfMax]:\n", " positionOfMax = location\n", "\n", " temp = alist[fillslot]\n", " alist[fillslot] = alist[positionOfMax]\n", " alist[positionOfMax] = temp\n", "\n", "alist = [54,26,93,17,77,31,44,55,20]\n", "selectionSort(alist)\n", "print(alist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Timing code sample below" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Local current time : Mon May 13 16:49:19 2019\n" ] } ], "source": [ "# From Project specification .pdf\n", "#import time library\n", "import time\n", "\n", "start_time = time.time()\n", "# Call your sorting algorithm here\n", "\n", "end_time = time.time()\n", "time_elapsed = end_time - start_time\n", "\n", "# local time\n", "localtime = time.asctime( time.localtime(time.time()) )\n", "print (\"Local current time :\", localtime)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Random Array code sample below" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# From Project specification .pdf\n", "# from random import randint\n", "def random_array(n):\n", " array = []# create array\n", " for i in range (0, n, 1):# \n", " array.append(randint(0, 100))# adds random integers between 0 and 100 to array\n", " return array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Benchmarking" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Import python modules\n", "# Code is from project specification.pdf\n", "import time\n", "from random import randint\n", "\n", "def random_array(n):\n", " array = []# create array\n", " for i in range (0, n, 1):# \n", " array.append(randint(0, 100))# adds random integers between 0 and 100 to array\n", " return array\n", "\n", "start_time = time.time()\n", "# Call your sorting algorithm here\n", "\n", "\n", "end_time = time.time()\n", "time_elapsed = end_time - start_time" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }